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Abstract 

This paper considers reconstructing a spectrally sparse signal from a small number of randomly 
observed time-domain samples. The signal of interest is a linear combination of complex sinusoids at 
R distinct frequencies. The frequencies can assume any continuous values in the normalized frequency 
domain [0,1). After converting the spectrally sparse signal recovery into a low rank structured matrix 
completion problem, we propose an efficient feasible point approach, named projected Wirtinger gradient 
descent (PWGD) algorithm, to efficiently solve this structured matrix completion problem. We further 
accelerate our proposed algorithm by a scheme inspired by FISTA. We give the convergence analysis of 
our proposed algorithms. Extensive numerical experiments are provided to illustrate the efficiency of our 
proposed algorithm. Different from earlier approaches, our algorithm can solve problems of very large 
dimensions very efficiently. 


1 Introduction 

Reconstructing a signal from a series of sampling measurements is a common theme in signal processing, 
which has numerous practical applications in radar, sonar, array processing, wireless communication, seis¬ 
mology, fluorescence microscopy, etc. Because of the constraints imposed by sampling hardware and physical 
measurement conditions, sometimes we can only obtain partial information, instead of full information, of 
a signal. For example, when we try to infer the frequency components of a signal, we may only be able to 
get a small number of discrete time-domain samples of this signal. In this paper, the signal of interest is 
a weighted sum of l-dimensional(l-D) complex sinusoids at R distinct continuous frequencies in the unit 
interval. From a small number of time-domain samples of the superposition of R sinusoids, we are interested 
in recovering the complete signals, and identifying the existing frequencies. This signal model covers signals 
in various applications, for example, in acceleration of medical imaging [2dj . analog-to-digital conversion m, 
and inverse scattering in seismic imaging [5] . 

Early conventional approaches, such as Prony’s method [26] , ESPRIT [25] , and the matrix pencil method 
EH, use sampling rates satisfying the Nyquist-Shannon sampling theorem. Compressed sensing (CS) is 
a new line of work in signal reconstruction, where, if the signal is sparse over some transform domain, 
the signal may be reconstructed with even fewer samples than the Nyquist sampling theorem requires m 
di]. In conventional compressed sensing, the signal of interest is generally assumed to have a sparse or 
approximately sparse representation over a finite discrete dictionary. However, signal parameters in practical 
applications often take values in a continuous domain. For example, in the problem considered in this paper, 
the frequencies take values in [0,1). One can discretize the continuous signal parameters to a finite set of 
equi-spaced points, and then apply the theory of CS to recover the discretized parameters. However, when 
the discretization is not fine enough, this will cause basis mismatch ES] in signal recovery. In basis mismatch, 
we will have non-negligible signal recovery errors resulting from the impact of discretization errors on CS 
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signal recovery procedures, unless we make grid discretization very fine, leading to an undesirably large 
dictionary for signal recovery, to reduce signal recovery error |28j . 

Recently there have been growing interests in designing new algorithms which can recover the continuous¬ 
valued parameters precisely even from a small number of discrete nonuniform time samples. In the au¬ 
thors proposed to use total variation minimization to find the continuous-valued frequencies from equi-spaced 
samples. In HZj, motivated by atomic norm minimization |13] . the authors used atomic norm minimization 
to recover signal frequencies from nonuniform samples. In m and 113, the authors convert the signal 
frequency recovery into a low-rank Toeplitz matrix completion problem. In [14) . the problem of recovering 
signal frequencies from nonuniform samples is formulated as a low-rank Hankel matrix completion problem, 
inspired by Prony’s method and the matrix pencil method. Though robust signal recovery is guaranteed 
theoretically through these methods in miiiTiiiii, convex optimization based low-rank structured matrix 
completions are not computationally efficient- the resulting optimization problems contain O(iV^) unknowns 
explicitly, where N is the dimension of signal. To solve the resulting matrix completion problems, off-the- 
shelf algorithms such as SDPT3 [29] use interior point methods which requires computing a Hessian matrix 
of size O(fV^) in its Newton step. First-order methods, such as alternating direction method of multipliers 
(ADMM) and proximal point algorithm (PPA), need a dual matrix that is unstructured [TH|, and, conse¬ 
quently, these algorithms require memory of size 0{N^). Therefore, these convex optimization approaches 
are not suitable for recovering signals of large dimensions. 

To efficiently recover high-dimensional signals, this paper proposes a projected Wirtinger gradient descent 
(PWGD) method for low-rank Hankel matrix completion. Instead of solving a convex relaxation of the 
low rank Hankel matrix completion problem, we directly deal with the non-convex low rank structured 
matrix completion problem. Our proposed PWGD algorithm is a feasible point algorithm, and it uses 
0{NR) memory. Since the number of sinusoids, i?, is usually much smaller than N, the proposed algorithm 
provides efficient large scale signal recovery. Global convergence analysis of our algorithm is provided based 
upon Attouch and Bolte’s theory mm- To speed up our proposed algorithm, an acceleration technique 
scheme similar to FISTA |2] is given. The practical applicability of our algorithm is validated by numerical 
experiments, which show our algorithms can recover high-dimensional signals as a superposition of multiple 
sinusoids. 

The paper is organized as follows. In Sectionj^ we describe our signal model, give essential concepts about 
Hankel matrix, and formulate the signal recovery problem. Our iterative algorithm and related convergence 
analysis is present in Section 3, where we also propose ways to accelerate the convergence of our algorithm. 
In Section 4, some numerical experiments are provided to demonstrate the performance of our algorithm. 
We then conclude our paper with a discussion of future work. 


2 Problem formulation 

In this section, we give some preliminaries on our signal model and the formulation of the signal reconstruction 
problem considered in this paper. 

2.1 Signal model 

The signal of our interest t S K, is assumed as a linear 

distinct frequencies G [0,1) for 1 < /c < R, i.e., 

where i = \/—l. 

Here the frequencies are normalized to be in [0,1) so that the signal can be uniquely determined by 

its time domain samples at integer points, and the associated coefficients are the complex amplitudes. 


combination of complex sinusoids at R 


t > 0, 
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This model covers a wide range of signals in wireless communication, biology, automation, imaging science, 
seismology, etc. 

To reconstruct the signal early methods (e.g. Prony’s method, the matrix pencil method, 

MUSIC) need time domain samples on uniformly sampled integer time points. More specifically, they use 
the following (2N — 1) samples in the time domain a;!*™®) (t) at f = 0,1,..., 2N — 2; and then, in order to get 
the frequencies of these early methods used linear algebra techniques involving linear structured 

matrices such as Hankel and Toeplitz matrices. However, due to physical measurement limitations, it is 
usually hard to get all the 2N — 1 samples of t = 0,1,..., 2N — 1, especially for signals with very 

high frequencies (before normalization) [30] . So in this paper, we will consider non-uniform sampling in the 
time domain. We denote the underlying uniformly-sampled true signal as 

a.(true) ^ . . . , - 2)]^ € 

where iV is a large integer. However, we consider the case where only M {M < 2N — 1) entries of are 

observed. In this way, the sampling rate is significantly reduced. The same signal model is also considered 

in [HllTlIIl]. 

2.2 Existing Algorithms 

Let 0 C {0,1,..., 2N — 2} be the set of indices of observed entries of . Our goal is to reconstruct the 

true vector from 

y = :={,c(‘-'-)(t) I tee}. ( 1 ) 

There are several existing algorithms in the literature for recovering the R sinusoids from the incomplete 
observations of 

One can discretize the frequency domain [0,1) by uniform grid Q with meshsize I/(2A'^ — I). Assume 
all frequencies k = 1,..., i?, are on the grid Q. Then, the discrete signal a;0''“®) can be written as 

^.(true) _ p* where F* is the inverse of the discrete Fourier transform (DFT) matrix of order 2N — 1, and 
c G is a sparse vector with non-zero entries at indices (2N — l)/fc’s. Then, the samples 0 can be 

written as y — FqC, where are partial rows of F*. Equivalently, our goal has turned into recovering the 
sparse vector c. According to the theory of compressed sensing |I2j . when 0 is uniformly randomly drawn 
from all subsets of {0,1,..., 2N — 2} with cardinality M, the sparse vector c (hence can be recovered 

exactly with high probability by solving 


min||c||i s.t. Fec = y, (2) 

C 

provided M > 0{K log N). Efficient algorithms for solving ([^ include Bregman iterations piziini] and 
iterative soft-thresholding algorithms mm- When the frequencies are not on the grid Q, we expect 

to have a good approximation of by solving ([^, as the differences between the true frequencies and 

the grid Q can be as small as 0(1/A^). Unfortunately, this discretization method can lead to large recovery 
errors m- This phenomena is known as basis mismatch of compressed sensing. To overcome this limitation, 
we will consider frequencies on the continuous domain [0,1) instead of discretizing it with a meshsize 

of 1/{2N-1). 

Super-resolution compressed sensing m and off-the-grid compressed sensing m consider reconstructing 

^(tru®) 

0 under the assumption that the frequencies take continuous values in the domain 

[0,1). The authors of [11] and [27] proposed to recover by solving 


min 

U,X,t 


Uo 

2{2N-1) 


1 

2 ^’ 


Tiu) 

X* 


h 0 , 


(3) 


where T is a linear operator that maps a vector u G c4Af-3 .j-g ^27V — 1) x (2N — 1) Toeplitz matrix T{u) 
satisfying [T{u)]jk = Uj-k for all 0 < j, k < 2N — 2. It was proved that, when 0 is uniformly randomly 
drawn from all the subsets of {0,1,..., 2N — 2} with cardinality M > 0{Rlog{R/S) \og{N/S)), the solution 
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of ([^ will match the true discrete signal with probability at least 1 — Extensions of the atomic 

norm minimization to 2D or higher dimensional complex exponentials can be found in |31[ I24j . 

Enhanced matrix completion [14] is another method that is able to reconstruct signals with frequencies 
taking continuous values. Enhanced matrix completion method converts the signal recovery problem to a 
Hankel matrix completion problem. Since this method is closely related to our proposed algorithm, we will 
introduce it in detail in the next subsection. 


2.3 Hankel Matrix Completion 


The enhanced matrix completion in m converts the reconstruction of from 0 to a Hankel matrix 

completion problem. Let H he a linear operator that maps a vector in to a iV x TV Hankel matrix as 

follows 


■H : a; e ^-Ha; e [nx]jk = Xj+k, 0<j,k<N-l. 


Define iTdfue) _ It can be checked that the rank of is R, due to the following factorization 




1 


1 


g2«/ 


(true) 

R 


ditrue) 




g27r*(M-l)/f"“=> ^ ^ g27ri(M-l)/<:;‘™> 


.(true) 


1 


(true) 


g27rr(M-l)/)j‘™> 


Then, instead of constructing the true signal directly, we reconstruct the rank-i? Hankel matrix 

'Ha;(t™e)_ Since R is one-to-one from a vector in to an x iV Hankel matrix, one can easily convert 

the reconstructed Hankel matrix back to a signal. 

Now the signal reconstruction problem is formulated as 


Find matrix X 

subject to rank(X) < i?, 

Xjk=Hfr\ {hk)€n, 

X is a Hankel matrix. 


(4) 


where D = {(j, fc) \ j + k & 0} is the positions of known entries in Since R is one-to-one from 

(j^2Ar-i |.Q gg|. all tv X Hankel matrix, reconstructing is equivalent to reconstructing H'l*’'™). 

Following generic low-rank matrix completion m, 0 is converted in m to a rank minimization problem 
and further relaxed to 

min ||X||* s.t. (a G D, and X is Hankel. (5) 

Here || • ||* is the sum of all the singular values, namely the nuclear norm. It was shown that, if 0 is 
uniformly randomly drawn from all subsets of {0,1,... ,2N — 2} with cardinality M > 0{R\og^ N), and 
certain separation conditions between frequencies are satisfied, then the solution of 0 recover _ff(true) 
perfectly with dominant probability. Similar models are considered in [^. 

Though ([^ is a convex optimization problem, there were no efficient ways to compute it for large problem 
dimensions. It has 0{N^) explicit unknowns instead of 0{N) in One may convert 0 to an SDP 

and then employ available packages such as SDPT3 j^Sj. However, these packages use second-order methods, 
which require solving a huge linear system of order 0{N‘^) xO{N'^) at each step. Also, it is not straightforward 
m to adapt nuclear norm minimization algorithms (e.g. 0 ) for generic low-rank matrix completion to 
solving 0, as the Hankel constraint invokes 0{N‘^) linear equality constraints. The semidefnite programming 
for atomic norm minimization 0 suffers from the same issue of high computational complexity. 

In this paper, instead of considering convex optimizations 0 and we aim at attacking the original 
non-convex problem 0 directly. Non-convex algorithms has been proven to have the advantage of fast 
convergence in sparsity and low-rank reconstruction ima. We propose an efficient algorithm based on 
projected Wirtinger gradient descent for this particular spectral signal recovery problem. 
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3 Projected Wirtinger Gradient Algorithm 

In this section, we present our projected Wirtinger gradient algorithm, prove its convergence, and provide 
an acceleration scheme. Our basic algorithm is a projected gradient flow in the Wirtinger sense, and its 
convergence is obtained by applying the framework in [1] for proximal alternating minimization. To accelerate 
the convergence, we use the strategy used in FISTA [5]. 


3.1 Basic algorithm 

This section is devoted to presenting our basic algorithm for solving Q. Let us define the set of all complex¬ 
valued matrices with rank no greater than R as 

= C^^^lrank(X) < R}. ( 6 ) 


Similarly, define the set of all complex-valued Hankel matrices that are consistent with the observed data 

= {Hx I X G xq = (7) 

The set is a smooth manifold and is an afline space. Then, our signal recovery problem and also @ 
can be formulated as the following optimization problem 


min “11-^ 


H 


( 8 ) 


We will employ a projected gradient descent algorithm to solve (§. The objective F(il, ff) := i||i—iT|||. 
is a real-valued function with complex variables, which is not differentiable in the ordinary complex calculus 
sense. Nevertheless, F{L,H) is differentiable with respect to the real and imaginary parts of its variables. 
Thus, our gradient flow is performed on the real and imaginary parts respectively. Denote 


Z = 


L 

H 


= JR -I- 


where 3R and 5 are the real and imaginary parts of Z. Rewrite F as A(3R, 5). Then, in our gradient flow 
algorithm, 3R is updated by and 5 by In other words, Z is updated by -I- By Wirtinger 
calculus m, we have the relation 


OF OF OF 

-h i — = 2^=. 

asj dz 


Direct calculations give 



' odF' 

az 


L-H 


— 

H-L 

L dHj 




Using the Wirtinger gradient, our proposed algorithm is given as follows: at iteration t, we have 


\ Ht+i G - 62 {Ht - it+i)). 


where > 0 and (52 > 0 are step sizes, and and Vje are projections onto and respectively. We 
call projected Wirtinger gradient descent (PWGD). 

It remains to find out V^r and respectively. Since V^r{X) is the best rank-i? approximation to 
X, according to Eckhart-Young Theorem [20], 


V^r{X) = Ur^rV^, 

where the columns of Ur and Vr are the first R left and right singular vectors of X respectively and Hr 
is a diagonal matrix with diagonals corresponding singular values. The closed form of is given by the 
following lemma 
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Lemma 1. We have 


Vjif{X) — Hz, where Zj 


meanjXfci | k + l = j}, 


ifj e 0, 

otherwise. 


( 10 ) 


Proof. V^{X) is the solution of the following least square problem 


Vj^{X) = a,T:gim'n{\\Z - X\\p : X ^ = H ■ aYgmm{\\Hz - X\\p : 2:0 = 


2N-2 


= n ■ argminj ^ ^ {zj - Xu f : zq = 

j=0 k+l=j 


It is obvious that the solution of the optimization problem in the last line is given by Zj in (10). 


□ 


The proposed PWGD algorithm (|^ is a feasible point algorithm. The iterates L* and Ht are always in 
their feasible sets and Jtf respectively. This property can significantly reduce the computational cost 
and storage, when R is small compared to N. Since L* G it is stored in a factorization form and only 
0{NR) memory is necessary. Also, the Hankel matrix Hf can be represented by its parameters, which is of 
size only 0{N). Furthermore, in Step 1 of ([^, it needs to compute only the first R singular values and their 
corresponding singular vectors of Lt — Si{Lt — Ht) in the computation of the projection. This can be done 
by, e.g., Krylov subspace methods, which invokes only the matrix-vector product of Lt — Si{Lt — Ht). For 
the matrix-vector product of Lt, since Lt is rank R and in a factorization form, it can be done in 0{NR) 
operations. The matrix-vector product of the Hankel matrix Ht is implemented by fast Fourier transform 
PP] . which needs only 0{N log N) operations. Step 2 of ([^ needs averages of Lt+i along anti-diagonals. 


3.2 Convergence 

In this subsection, we prove the convergence of the proposed PWGD algorithm <§• Our proof is achieved 
by applying the convergence result in [T]. 

Gonsider a general non-convex optimization problem 


min'tp{x,y) := (j){x,y) + 6»(x) Hw{y), 

x,y 


( 11 ) 


where the functions 9 : R" 1 —)■ K U {+ 00 } and oj : K™ 1 —)■ K U {+ 00 } are proper lower semicontinuous 
functions and (f : R" x R™ 1 —>■ R is a function. It was proposed in [T] a proximal alternating minimization 
algorithm for solving 0 


Xk+i G argmin^gR„ i’{x,yk) + - Xk\\l, 

yk+i G argmmy^^,^ fj{xk+i,y) + ^\\y - ykWl- 


( 12 ) 


Under the assumption that the function ip satisfies the so-called Kurdyka-Lojasiewicz (KL) condition and 
\/(f) is Lipschitz on bounded sets, [1] proved the convergence of (12). Generally, the KL condition is not easy 
to check. A sufficient condition to guarantee the KL condition is the semi-algebraic property. A proper and 
lower semi-continuous function is called semi-algebraic if its graph is a semi-algebraic set. Recall a subset 
S' C R^^ is a real semi-algebraic set if there exists a finite number of real polynomial function gtj, hij : R^^ 1 — >■ R 
such that 

p g 

S = IJ Pi {it G R^' I gij{u) = 0 , hij (it) < 0 } . 

j=ii=i 

Ghoose 6 = 6c and uj = 5d are indicator functions for the sets C G R" and D G 

0, if X G C, 


respectively. Recall 


the indicator function 5c of a set C is defined as 5c{x) = 


-1-00, if X ^ C 


■ Let (j){x,y) = ^\\x-y\\l. Then 
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( 13 ) 


(12) becomes an alternating projection algorithm 

Xk-\-i G Vc (xk - T^ixk - 2/fc)) , 

?/fc+i ^ 'Pd yVk i^rj^iUk • 

The results in [1] imply the following convergence theorem of (13), which is a corollary of Corollary 12 of [I] 
and Theorem 3 and Example 2 of [4]. 

Theorem 1. Assume that the sets C C R" and D C R™ are semi-algebraic. Let {xi~,yk) be generated by 

(13) with 0 < a < Sk, < b for all k. 

(a) Either \\{xk,yk )\\2 —t oo as k ^ oo, or {xk,yk) converges to a critical point of if. 


(b) If we further assume {xQ,yo) is feasible and sufficiently close to a global minimizer of if, then (a;o,yo) 
converges to a global minimizer of if. 

Next we apply Theorem to the PWGD algorithm © to get its convergence. The PWGD algorithm © 
is in the same form as However, our PWGD algorithm is performed in complex-valued matrix spaces, 
while the setting of Theorem is in real. Nevertheless, we can identify any complex-valued matrix to a 
real one by concatenating its real and imaginary parts. Actually, as aforementioned, our Writinger gradient 
descent is exactly obtained in this way by considering the gradient with respect to the real and imaginary 
parts. Since the objective function F{L, H) in ([^ does not change after this identification, we only to check 
the sets 1%^ in ([^ and in ([^ are semi-algebraic when viewed as sets of real and imaginary parts. This 
is done by the following two lemmas. 

Lemma 2. The set defined as follows is a semi-algebraic set 

s^r = {[x,y] I (x + *y) G^«}. 


Proof. Denote 
and 


= {[X, Y]\X,Y & rank(X -b lY) = r) 

^, = 1 [x,r] I e 


^NxN 


X -Y 
Y X 


= 2r 


We first prove by showing C and C respectively. Let [-X',1^] G and 

a singular value decomposition (SVD) oi X -\- lY is {X -\- lY) = {Ure + *C^im)S(\die + where 

[X —Y 

^rxr^ Then, by direct calculation, we see that an SVD of 


^Rej ^Imj ^ 

is given by 


and S G 


X -Y 


'C/ro 

-Ui^ 

s 

o' 

'VRe 

-Vim' 

Y X 


Ui^ 

Ure . 

0 

s 


VRe _ 


X 


(14) 


Therefore, rank 
[X,Y] G If 


X 

-Y 

Y 

X 

( 

Ml 


r’ 

M2 

5 


= 2r, which implies {X,Y] G and further C Gonversely, let 


is a singular triplet of 


X -Y 
Y X 


, then tj. 


-U2 

Ml 


direct calculation. Therefore, the multiplicity of each singular value is even, and SVD’s of 


-V2 
Vl 

X -Y 
Y X 


is too by 
must 


be in the form of ( [14| . Gonsequently, {X -j- %Y) = (LIrb + *I^im)S(VRe + is an SVD of {X -\- lY), 

which implies rank(X -b lY) = r. Therefore, [IV, V] G Thus, C t^r- 

Since is the intersection of the set of all rank-2r real-valued matrices and the linear subspace of 

X — 

, it is deducted from [U Example 2] that is a semi-algebraic set. This 


matrices in the form of ^ 

together with implies is a semi-algebraic set too. 

Finally, it is obvious that IYr = IJ^o Therefore, IYr is a semi-algebraic set. 


□ 
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Lemma 3. The set Jtf defined as follows is a semi-algebraic set 


Jtr={[X,Y] I (X-h zY) e . 

Proof. Since H is a linear operator, 

TLx = VM{x) + iTL'^{x). 

Futher, for any x satisfying xq = we have 3fi(a:0) = and $5(a;e) = 3 ( 0 : 0 '^“'^^). Therefore, 

jr = + iQ{jr) = jei+ije2, 


where 


JtT\ = {"Hr I r e re = 3fi(a;^™^)}, ^ = {Hi \ i G ie = 

This shows = JY\ x JYi. Since both JY\ are JYi are affine spaces, their product JY is also, which implies 
JY is semi-algebraic. □ 

Combining Theorem and Lemmas and leads to the following convergence results of the proposed 
algorithm 

Theorem 2. Let {Lt,Ht) be generated by with 0 < di,(52 < 1. 

(a) Either \\{Lt,Ht)\\ F —t 00 as t ^ 00 , or {Lt, Ht) converges. 

(b) If we further assume {Lq, Hq) is feasible and sufficiently close to a global minimizer of {^WL—H\\{\L G 

G then {Lo,Ho) converges to a global minimizer 0 /min^g^H ^jg^ i||L — iT||F. 

We would like to remark that the unboundedness in Q(a) is not a problem and can be overcome by 
introduce a bound constraint in the set Jif. For example, we can define Jif = Jifn{X \ ||X|joo < B} with B 
a very large number, and ([^ is slightly modified by replacing JY by ,Jif. Then all the conditions in Theorem 
[^can still be verified. Since \\{Lk, Hk)\\F 00 , we must have {Lk,Hk) converges. 


3.3 Acceleration by a FISTA Scheme 


In this subsection, we propose a scheme to accelerate the convergence of the PWGD algorithm Q. Our 
scheme borrows from the fast iterative shrinkage-thresholding algorithm (FISTA) [5] , which has been proven 
to be efficient in minimizing the sum of two convex functions with one having a Lipschitz continuous gradient. 
The basic idea is to use a specific linear combination of two successive iterates. Although our problem is 
non-convex, we still employ the linear combination scheme in FISTA for our model. 

Our PWGD with FISTA scheme, called PWGD-FISTA, is constructed as follows: Given ko = 1, we 
generate {Lt,Ht} hy 

Lt+i G V^R{Lt — 5i{Lt — Ht)), 

Ht+I G Vj^iHt - S2{Ht - Bt+i)), 

, _ ffii+Yf+i ( 15 ) 

i^t+i — 2 ’ 

iT*+i = Jft+i + - FfO 


Since JY is an affine subspace, the linear combination in the last line of (15) does not change the feasibility 
of Ht+i, i.e., Ht+i G JY. This guarantees that the computational complexity and required storage of Step 
1 and Step 2 in the PWGD-FISTA algorithm are the same as that in the PWGD algorithm §. Also, the 
computational effort in Step 3 and Step 4 of (15) is negligible compared with that in Step 1 and Step 2. 
Therefore, the PWGD-FISTA algorithm preserves the computational simplicity of the PWGD algorithm. 
As we will see in the numerical experiments section, the PWGD-FISTA Algorithm converges faster than the 
PWGD algorithm. 






4 Numerical experiments 

In this section, we use numerical experiments to demonstrate the effectiveness and efficiency of our proposed 
algorithms. 

4.1 Phase Transition 

We first illustrate that our proposed algorithm is able to recovery spectrally sparse signals from their very 
limited time domain samples. We fix the dimension of the signal to be 127 (i.e. N = 64), and we vary the 
sparsity R and the number of samples M. For each {R, M) pair, 100 Monte Carlo trials were conducted. For 
each trial, the true signal is synthesized by randomly generating the true frequencies ’s and magnitudes 

which are independently uniformly distributed on [0,1) and the unit circle respectively. We then 
get M samples uniformly at random. The PWGD is executed by setting the parameters 6 i = 82 = 0.9999, 
and we stop the algorithm when ||iTt+i — Ht\\F/\\Ht\\F < 10“^. Signal recovery for each trial is considered 
successful if the relative error satisfies ||i — || 2 /||£c(‘™®^|j 2 < 5 x 10“^, where x denotes the solution 

returned by the PWGD. Figure [^illustrates the results of the Morte Carlo experiments. Here the horizontal 
axis corresponds to the number M of the samples (i.e. the size of the observation location set 0), while the 
vertical axis corresponds to the sparsity level R. The empirical success rate is reflected by the color of each 
cell. It can be seen from the figure that our proposed algorithm has a high rate of successful recovery if M 
exceeds than certain thresholds for a given R. 


Numerical performance 



t1:the number of samples 


Figure 1: Phase transition for successful recovery rate, where frequency locations are randomly generated. 
The horizontal axis stands for the number of samples, and the vertical axis represents the sparsity R. The 
demonstrated empirical success rate is calculated by averaging over 100 Monte Garlo trials. 

4.2 Signals of large dimension 

Next we demonstrate that our proposed algorithm is able to recover signals of large scale, and compare it 
with the Enhance Matrix Gompletion (EMaG) in [14]. As we have argued, different from existing convex 
optimization based methods such as EMaG, our proposed algorithm is able to work with high-dimensional 
spectrally sparse signals. In Table [^ the elapsed time for signals of different dimensions are listed. For our 
algorithm, we use the same settings as in the previous section. For EMaG algorithm, we used GVX software 
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to solve it. From the table, we can see that PWGD can greatly speed up the the signal recovery for moderate 
dimensions and also work well for signals of high dimensions. 



PWGD 

EMaG 

the signal with N = 51, A = 1, M = 10 

0.34 

46.8 

the signal with N = 51, R = 3, M = 20 

0.46 

58.0 

the signal with N = 101, R = 5,M = 40 

0.95 

out of memory 

the signal with N = 501, R = 5,M = 100 

12.7 

out of memory 

the signal with N = 2501, A = 13, M = 500 

133 

out of memory 

the signal with N — 2501, A = 25, M = 1000 

91.4 

out of memory 

the signal with N = 5001, A = 20, M = 1000 

645 

out of memory 

the signal with N = 5001, A = 31, M = 2000 

403 

out of memory 


Table 1: Elapsed time in seconds for signals of different dimensions. 


4.3 Acceleration by a FISTA-like Scheme 

Figure depicts the convergence curves of PWGD and PWGD-FISTA. We see clearly that the PWGD- 
FISTA Algorithm converges faster than the PWGD algorithm. Roughly, the PWGD-FISTA needs only 2/3 
number of iterations that PWGD requires to get solutions of the same accuracy. 




(a) N = 501, A = 8, M = 200 (b) N = 5001, A = 20, M = 1000 


Figure 2: Gonvergence rate comparison of PWGD and PWGD-FISTA . 


5 Conclusion 

In this paper, a fast iterative algorithm is proposed for recovering spectrally sparse signals whose frequencies 
can be any values in the continuous domain [0,1) from a small amount of time domain samples. Different 
from existing algorithms, our proposed algorithm is able to deal with signals of large dimension. Inspired 
by the scheme in FISTA, we also provided an acceleration of the proposed algorithm. In the future, we will 
extend our algorithms to signal recovery from noisy samples and signals with multivariate frequencies. 
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